Distributed Block-diagonal Approximation Methods for Regularized Empirical Risk Minimization
Abstract
Designing distributed algorithms for empirical risk minimization (ERM) has become an active research topic in recent years because of the practical need to handle huge volumes of data. In this paper, we propose a general framework for training an ERM model by solving its dual problem in parallel over multiple machines. Our method provides a versatile approach to many large-scale machine learning problems, including linear binary/multi-class classification, regression, and structured prediction. Compared with existing approaches, we show both theoretically and empirically that our method converges faster under weaker conditions.
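The abstract does not spell out the algorithm, so the following is a minimal single-machine sketch of one natural instance of the idea: the dual of an L2-regularized hinge-loss SVM is split into blocks of dual variables, each block is updated by coordinate ascent against a shared iterate (as one machine would do locally), and the block updates are then synchronized. The partitioning, step rule, and synchronization scheme below are illustrative assumptions, not the paper's exact method.

    import numpy as np

    def block_diag_dual_svm(X, y, lam=1.0, n_blocks=4, outer_iters=50):
        """Sketch: distributed-style dual coordinate ascent for hinge-loss SVM.

        X: (n, d) float array, y: (n,) labels in {-1, +1}.
        Dual: max_a sum(a) - ||X^T (a*y)||^2 / (2*lam), 0 <= a_i <= 1,
        with primal iterate w = X^T (a*y) / lam.
        """
        n, d = X.shape
        alpha = np.zeros(n)                      # dual variables
        w = np.zeros(d)                          # shared primal iterate
        blocks = np.array_split(np.arange(n), n_blocks)
        sq = np.einsum('ij,ij->i', X, X)         # per-example squared norms
        for _ in range(outer_iters):
            deltas = []
            for blk in blocks:                   # each block would live on one machine
                d_alpha = np.zeros(n)
                w_local = w.copy()               # blocks only see the shared iterate
                for i in blk:                    # Newton-style coordinate ascent step
                    g = y[i] * (X[i] @ w_local) - 1.0
                    new_a = np.clip(alpha[i] - g / (sq[i] / lam + 1e-12), 0.0, 1.0)
                    d_alpha[i] = new_a - alpha[i]
                    w_local += d_alpha[i] * y[i] * X[i] / lam
                deltas.append(d_alpha)
            alpha += sum(deltas)                 # synchronization of block updates
            w = X.T @ (alpha * y) / lam          # one round of communication
        return w, alpha

Note that naively summing all block updates can overshoot; distributed dual methods in this family typically scale or line-search the combined step, which is exactly where the analysis of the block-diagonal approximation to the dual Hessian enters.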
Similar Resources
A Dual Augmented Block Minimization Framework for Learning with Limited Memory
In the past few years, several techniques have been proposed for training linear Support Vector Machines (SVMs) in the limited-memory setting, where a dual block-coordinate descent (dual-BCD) method was used to balance the cost spent on I/O and computation. In this paper, we consider the more general setting of regularized Empirical Risk Minimization (ERM) when data cannot fit into memory. In particular, w...
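As a hedged illustration of the I/O-versus-computation trade-off described above, the sketch below streams one block of examples from disk at a time and runs several dual coordinate-descent epochs on it before moving on, so computation amortizes the I/O cost. The .npz file layout and the update rule (the same one as in the previous sketch) are assumptions for the example, not the paper's method.

    import numpy as np

    def dual_bcd_out_of_core(block_files, d, lam=1.0, inner_epochs=5, passes=10):
        """Sketch: each file in block_files is a .npz with arrays 'X' and 'y'."""
        w = np.zeros(d)
        alphas = {f: None for f in block_files}      # dual variables per block
        for _ in range(passes):
            for f in block_files:
                data = np.load(f)                    # one I/O per block per pass
                X, y = data['X'], data['y']
                a = alphas[f] if alphas[f] is not None else np.zeros(len(y))
                sq = np.einsum('ij,ij->i', X, X)
                for _ in range(inner_epochs):        # reuse the block while in RAM
                    for i in range(len(y)):
                        g = y[i] * (X[i] @ w) - 1.0
                        new_a = np.clip(a[i] - g / (sq[i] / lam + 1e-12), 0.0, 1.0)
                        w += (new_a - a[i]) * y[i] * X[i] / lam
                        a[i] = new_a
                alphas[f] = a
        return w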
The proximal point method revisited
In this short survey, I revisit the role of the proximal point method in large-scale optimization. I focus on three recent examples: a proximally guided subgradient method for weakly convex stochastic approximation, the prox-linear algorithm for minimizing compositions of convex functions and smooth maps, and Catalyst generic acceleration for regularized Empirical Risk Minimization.
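For concreteness, here is a minimal sketch of the proximal point iteration x_{k+1} = prox_{t f}(x_k) on f(x) = |x|, whose proximal map is the soft-thresholding operator; the function and step size are chosen only for illustration.

    import numpy as np

    def soft_threshold(x, t):
        # prox of t*|.| : argmin_z |z| + (z - x)^2 / (2t)
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    x, t = 5.0, 1.0
    for k in range(8):
        x = soft_threshold(x, t)    # one proximal point step
        print(k, x)                 # moves toward the minimizer 0 by t per step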
Un-regularizing: approximate proximal point algorithms for empirical risk minimization, Appendix A: Derivation of regularized ERM duality
For completeness, in this section we derive the dual (5) of the problem of computing the proximal operator for the ERM objective (3).
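Equations (3) and (5) of that paper are not reproduced in this snippet. As a hedged reconstruction, for a proximal subproblem of the commonly assumed form below (data matrix A with rows a_i^T, convex losses f_i with conjugates f_i^*, prox center v), the standard Lagrangian computation with z_i = a_i^T w gives:

    \min_{w}\; \frac{1}{n}\sum_{i=1}^{n} f_i(a_i^\top w) + \frac{\lambda}{2}\lVert w - v\rVert^2
    \;=\;
    \max_{\alpha \in \mathbb{R}^n}\; -\frac{1}{n}\sum_{i=1}^{n} f_i^*(\alpha_i)
    + \frac{1}{n}\,\alpha^\top A v
    - \frac{1}{2\lambda n^2}\lVert A^\top \alpha\rVert^2,
    \qquad
    w(\alpha) = v - \frac{1}{\lambda n} A^\top \alpha .

Minimizing the Lagrangian over each z_i produces the conjugate terms f_i^*(alpha_i), and minimizing over w yields the stated primal-dual relation w(alpha).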
Accelerated Mini-batch Randomized Block Coordinate Descent Method
We consider regularized empirical risk minimization problems. In particular, we minimize the sum of a smooth empirical risk function and a nonsmooth regularization function. When the regularization function is block separable, we can solve the minimization problems in a randomized block coordinate descent (RBCD) manner. Existing RBCD methods usually decrease the objective value by exploiting th...
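As an illustration of the block-separable setting, the sketch below runs plain RBCD (without the mini-batching or acceleration of the paper's method) on a group-lasso-regularized least-squares problem; the block sizes and per-block Lipschitz estimates are illustrative assumptions.

    import numpy as np

    def rbcd_group(X, y, lam=0.1, block_size=5, iters=500, seed=0):
        """Sketch: min_w 0.5*||Xw - y||^2 + lam * sum_b ||w_b||_2 via RBCD."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)
        blocks = np.array_split(np.arange(d), max(1, d // block_size))
        L = [np.linalg.norm(X[:, b], 2) ** 2 + 1e-12 for b in blocks]
        r = X @ w - y                                  # residual cache
        for _ in range(iters):
            j = rng.integers(len(blocks))              # sample one block at random
            b = blocks[j]
            g = X[:, b].T @ r                          # gradient of the block
            z = w[b] - g / L[j]                        # gradient step on the block
            norm = np.linalg.norm(z)
            # proximal (group soft-threshold) step for the ||.||_2 regularizer
            new_wb = max(0.0, 1.0 - (lam / L[j]) / norm) * z if norm > 0 else z
            r += X[:, b] @ (new_wb - w[b])             # keep residual consistent
            w[b] = new_wb
        return w

Block separability is what makes the per-block proximal step closed-form here: the regularizer decomposes as a sum over blocks, so updating one block never touches the others.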
Fast Learning from Non-i.i.d. Observations
We prove an oracle inequality for generic regularized empirical risk minimization algorithms learning from α-mixing processes. To illustrate this oracle inequality, we use it to derive learning rates for some learning methods including least squares SVMs. Since the proof of the oracle inequality uses recent localization ideas developed for independent and identically distributed (i.i.d.) proces...
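For reference, the α-mixing (strong mixing) coefficients referred to above are defined, for a stationary process (Z_t) with sigma-algebras F_i^j = sigma(Z_i, ..., Z_j), by

    \alpha(n) \;=\; \sup_{t \ge 1}\;
    \sup_{A \in \mathcal{F}_1^{t},\, B \in \mathcal{F}_{t+n}^{\infty}}
    \bigl| P(A \cap B) - P(A)\,P(B) \bigr|,

and the process is α-mixing when α(n) → 0 as n → ∞; i.i.d. data corresponds to α(n) = 0 for all n.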